HMM based TTS for mixed language text
Authors
Abstract
When synthesizing Chinese text mixed with English text, it is usually preferable to synthesize the mixed-language content with a single voice. However, the synthesized English of HMM-based TTS may sound unnatural if the models are built directly from a Chinese speaker's non-professional English data. In this paper, we propose using the MLLR speaker adaptation method to leverage a native English speaker's model and generate more natural English for the Chinese speaker. The adapted F0 and spectrum models are used together with the original English speaker's duration models for better prosody. In the synthesis stage, mixed-language content shares a unified prosody tree to improve the continuity between the Chinese and English content. Evaluation results show that the proposed method significantly improves the speaker consistency and naturalness of synthesized speech for mixed-language text compared to using directly built models.
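For readers unfamiliar with MLLR, the standard form of its mean transformation is sketched below. This is the general textbook formulation, not a detail taken from the abstract: the symbols $W$, $A$, $b$, $\xi_m$, and $\gamma_m(t)$ are introduced here for illustration, and the abstract does not say how regression classes were defined or whether covariances were also adapted.

```latex
% Standard MLLR mean adaptation (general formulation; notation assumed here).
% \mu_m      : mean of Gaussian m in the source (native English speaker) model
% \hat\mu_m  : adapted mean for the target speaker
% W = [b  A] : affine transform shared by all Gaussians in a regression class
\hat{\mu}_m = A\,\mu_m + b = W\,\xi_m,
\qquad
\xi_m = \begin{bmatrix} 1 \\ \mu_m \end{bmatrix}

% W is chosen to maximize the likelihood of the adaptation data o_1..o_T,
% where \gamma_m(t) is the occupancy probability of Gaussian m at frame t.
\hat{W} = \arg\max_{W} \sum_{t=1}^{T} \sum_{m} \gamma_m(t)\,
          \log \mathcal{N}\!\left(o_t;\; W\,\xi_m,\; \Sigma_m\right)
```

Under this reading, the F0 and spectrum streams would each receive such transforms, while the duration models are left unadapted, as the abstract states.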
Similar resources
Performance Analysis of Text To Speech Synthesis System Using HMM And Prosody Features With Parsing For Tamil Language
This paper describes a Hidden Markov Model (HMM) based text-to-speech (TTS) system and a prosody-based TTS system for producing natural-sounding synthetic speech in the Tamil language. The HMM-based system consists of two phases: training and synthesis. Tamil speech is first parameterized into spectral and excitation features using Glottal Inverse Filtering (GIF). An emotions present in the input text is...
Development of HMM-based Malay Text-to-Speech System
This paper presents the development of a hidden Markov model (HMM)-based Malay text-to-speech (TTS) system. To our knowledge, this is the first report on the development of an HMM-based speech synthesis system for the Malay language. In this paper, we first discuss Malay speech characteristics, specifically the Malay phonological system and syllable structure. In the Malay phonological sys...
Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish
Hidden Markov Models based text-to-speech (HMM-TTS) synthesis is one of the techniques for generating speech from trained statistical models where spectrum and prosody of basic speech units are modelled altogether. This paper presents the advances in our Spanish HMM-TTS and a perceptual test is conducted to compare it with an extended PSOLA-based concatenative (E-PSOLA) system. The improvements...
An HMM-based bilingual (Mandarin-English) TTS
We propose to build an HMM-based, Mandarin and English, bilingual TTS system. Starting with a simple baseline of two TTS systems built separately from Mandarin and English databases recorded by the same speaker, we construct a new, mixed-language TTS by designing language specific and independent questions to facilitate phone sharing across the two languages. With shared phones, the new system ...
An Open Source HMM-based Text-to-Speech System for Brazilian Portuguese
Text-to-speech (TTS) is currently a mature technology that is used in many applications. Some modules of a TTS depend on the language and, while there are many public resources for English, the resources for some underrepresented languages are still limited. This work describes the development of a complete TTS system for Brazilian Portuguese which expands the already available resources. The s...
Journal title:
Volume, issue:
Pages: -
Publication date: 2010